• Data-intensive computing infrastructure systems for unmodified biological data analysis pipelines 

      Bongo, Lars Ailo; Pedersen, Edvard; Ernstsen, Martin (Journal article; Peer reviewed, 2015-11-18)
      Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many ...
    • Mario. A system for iterative and interactive processing of biological data 

      Ernstsen, Martin (Master thesis, 2013-11-15)
      This thesis addresses challenges in metagenomic data processing on clusters of computers, in particular the need for interactive response times during development, debugging, and tuning of data processing pipelines. Typical metagenomics pipelines batch-process data and have execution times ranging from hours to months, making configuration and tuning time-consuming and impractical. We have analyzed ...